[PHP] Sanitizing strings to make them URL and filename safe?
Posted by Xeoncross on Stack Overflow
Published on 2010-04-19T15:51:44Z; indexed on 2010/04/19 15:53 UTC
        
I am trying to come up with a function that sanitizes strings so that they are safe to use in a URL (like a post slug) and also safe to use as file names. For example, when someone uploads a file, I want to make sure that all dangerous characters are removed from its name.
So far I have come up with the following function, which I hope solves this problem while still allowing foreign UTF-8 data.
/**
 * Convert a string to the file/URL safe "slug" form
 *
 * @param string $string the string to clean
 * @param bool $is_filename TRUE will allow additional filename characters
 * @return string
 */
function sanitize($string = '', $is_filename = FALSE)
{
    // Replace each run of disallowed characters with a dash
    // (the result must be assigned back, or the replacement is discarded)
    $string = preg_replace('/[^\w\-'. ($is_filename ? '*~_\.' : ''). ']+/u', '-', $string);
    // Only allow one dash separator at a time (and make string lowercase)
    return mb_strtolower(preg_replace('/--+/u', '-', $string), 'UTF-8');
}
Does anyone have any tricky sample data I can run against this, or know of a better way to safeguard our apps from bad names?
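As a starting point for that kind of testing, here is a minimal sketch of a harness that runs a few awkward inputs through the function. The name `sanitize_slug` and the sample strings are hypothetical; the function body is a copy of the one above, on the assumption that the result of the first `preg_replace()` is meant to be assigned back to `$string`.

```php
<?php
// Hypothetical copy of the function above. Assumption: the first
// preg_replace() result is assigned back to $string.
function sanitize_slug($string = '', $is_filename = false)
{
    // Replace each run of disallowed characters with a single dash
    $string = preg_replace('/[^\w\-' . ($is_filename ? '*~_\.' : '') . ']+/u', '-', $string);
    // Collapse repeated dashes and lowercase the result
    return mb_strtolower(preg_replace('/--+/u', '-', $string), 'UTF-8');
}

// Illustrative inputs only: punctuation, path traversal, dot-only names,
// repeated dashes, and accented UTF-8 letters.
$samples = array(
    'Hello World!',
    '../../etc/passwd',
    '..',
    'a---b',
    'My Résumé (final).PDF',
);

// Print the slug form and the filename form side by side
foreach ($samples as $s) {
    echo $s, ' => ', sanitize_slug($s), ' | ', sanitize_slug($s, true), "\n";
}
```

Two things stand out from inputs like these. In filename mode dots are allowed, so `..` passes through unchanged and `../../etc/passwd` becomes `..-..-etc-passwd`; a name made only of dots may need separate handling. Leading and trailing dashes are also left in place (`Hello World!` becomes `hello-world-`). Separately, with the `/u` modifier `preg_replace()` returns NULL when the subject is not valid UTF-8, so invalid byte sequences are worth adding to the sample set.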
© Stack Overflow or respective owner